
Improve resilience and performance of do_bulk_inference #128


Merged

mhaas merged 15 commits into main from bulk_inference_resilience on Jun 17, 2022

Conversation

@mhaas (Member) commented May 19, 2022

In case of errors, the InferenceClient.do_bulk_inference method
will now return None for the affected objects instead of aborting
the entire bulk inference operation (and discarding any successfully
processed objects).

Fixes issue #68

The fix for #68 is different from what is described in #68. Instead of
using a generator-based approach, which would require the SDK consumer to
implement the error handling themselves, the SDK itself now handles the
errors. The downside of not using a generator is a larger memory footprint,
since the results are accumulated in a list. As an alternative, we could
use a generator that yields either the successfully processed inference
results or None for the failed objects, which would save memory.
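
For illustration, here is a minimal sketch of the generator-based alternative
described above. The helper names, payload types, and chunk size are invented
for this sketch and are not the SDK's actual code:

```python
from typing import Iterable, Iterator, List, Optional


def _predict_chunk(chunk: List[dict]) -> List[dict]:
    """Placeholder for one inference request covering a chunk of objects."""
    raise NotImplementedError


def _predict_or_none(chunk: List[dict]) -> Iterator[Optional[dict]]:
    try:
        yield from _predict_chunk(chunk)
    except Exception:
        # Keep positional alignment by yielding one None per object
        # in the failed chunk.
        yield from (None for _ in chunk)


def bulk_inference_generator(
    objects: Iterable[dict], chunk_size: int = 50
) -> Iterator[Optional[dict]]:
    """Lazily yield one result per input object, or None for failed chunks."""
    chunk: List[dict] = []
    for obj in objects:
        chunk.append(obj)
        if len(chunk) == chunk_size:
            yield from _predict_or_none(chunk)
            chunk = []
    if chunk:
        yield from _predict_or_none(chunk)
```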

Additionally, this commit introduces parallel processing in InferenceClient.do_bulk_inference,
which greatly improves performance. Due to the non-lazy implementation of
ThreadPoolExecutor.map, this slightly increases memory usage (cpython issue #74028).
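
As a rough sketch of the combined approach (chunked requests processed by a
thread pool, with a failed chunk degrading to None results instead of aborting),
again with invented helper names rather than the SDK's actual implementation:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def _predict_chunk(chunk: List[dict]) -> List[dict]:
    """Placeholder for one inference request covering a chunk of objects."""
    raise NotImplementedError


def do_bulk_inference_sketch(
    objects: List[dict], chunk_size: int = 50, workers: int = 4
) -> List[Optional[dict]]:
    chunks = [objects[i : i + chunk_size] for i in range(0, len(objects), chunk_size)]

    def predict_or_none(chunk: List[dict]) -> List[Optional[dict]]:
        try:
            return list(_predict_chunk(chunk))
        except Exception:
            # A failed chunk contributes one None per object instead of
            # aborting the entire bulk operation.
            return [None] * len(chunk)

    results: List[Optional[dict]] = []
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Executor.map consumes its input iterable eagerly (cpython issue
        # #74028), so all chunks are materialized in memory up front.
        for chunk_results in executor.map(predict_or_none, chunks):
            results.extend(chunk_results)
    return results
```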

Checks:

  • README.md is maintained
  • Documentation is maintained
  • CHANGELOG.md is updated
  • Docstring maintained
  • Version number increased
  • Tests added

mhaas added 12 commits May 19, 2022 18:13
In case of errors, the `InferenceClient.do_bulk_inference` method
will now return `None` for the affected objects instead of aborting
the entire bulk inference operation (and discarding any successfully
processed objects).

Fixes issue #68

The fix for #68 is different from what is described in #68. Instead of
using a generator-based approach, which would require the SDK consumer to
implement the error handling themselves, the SDK itself now handles the
errors. The downside of not using a generator is a larger memory footprint,
since the results are accumulated in a list. As an alternative, we could
use a generator that yields either the successfully processed inference
results or `None` for the failed objects, which would save memory.

Additionally, this commit introduces parallel processing in `InferenceClient.do_bulk_inference`.
This greatly improves performance. Due to the non-lazy implementation of
`ThreadPoolExecutor.map`, this slightly increases memory usage ([cpython issue #74028])

[cpython issue #74028]: python/cpython#74028
This also contains a documentation formatting fix.
setuptools must be installed much earlier in the process.
@mhaas requested a review from SreevishnuAB on June 17, 2022 07:59
@SreevishnuAB (Contributor) left a comment


Thanks!

The bulk_inference code is now multithreaded. For this reason, the trick of
returning different values from the Mock based on the order of the calls no longer works.
This happened to work somewhat accidentally on CPython, but not on PyPy.
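
As a sketch of the kind of order-independent mocking this refers to, using
unittest.mock: the canned responses are selected based on the request payload
rather than on call order. The payload shape ("objects"/"objectId") is an
illustrative guess, not the SDK's actual wire format.

```python
from unittest.mock import Mock

# Canned responses keyed by the first object ID in each request.
responses_by_object_id = {
    "obj-1": {"status": "DONE", "predictions": ["label-a"]},
    "obj-2": {"status": "DONE", "predictions": ["label-b"]},
}


def fake_post(url, json=None, **kwargs):
    # Dispatch on the request contents instead of relying on call order,
    # so concurrent calls still receive the matching canned response.
    object_id = json["objects"][0]["objectId"]
    response = Mock()
    response.status_code = 200
    response.json.return_value = responses_by_object_id[object_id]
    return response


session = Mock()
session.post.side_effect = fake_post
```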
@mhaas force-pushed the bulk_inference_resilience branch from 264db6f to accc46b on June 17, 2022 12:55
mhaas added 2 commits June 17, 2022 15:00
This is mainly useful to fix the tests which rely on the
mocks being called in a certain order. One of the tests supports
concurrency by mocking in a better way, but this was not feasible
for the other tests.

This commit also updates the documentation build tools to the
latest version to fix the documentation build on my local machine.
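
The exact mechanism the commit uses is not shown in this excerpt; purely as an
illustration, a hypothetical worker-count parameter that tests could pin to 1
to serialize calls might look like this:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_chunks(
    chunks: Iterable[list],
    predict: Callable[[list], list],
    workers: int = 4,
) -> List[list]:
    # Tests that depend on mocks being called in a fixed order can pass
    # workers=1 to serialize the requests.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(predict, chunks))
```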
@mhaas force-pushed the bulk_inference_resilience branch from 31da951 to 5bbf4a2 on June 17, 2022 15:36
@mhaas merged commit 68cf17a into main on Jun 17, 2022
@mhaas deleted the bulk_inference_resilience branch on June 17, 2022 15:55